Hybrid Policies

Rather than looking for one optimal policy (say, the optimal MPA), we could optimise the score over every combination of simple policies. Finding the optimal hybrid policy involves maximising the score within the Cartesian product of the parameter spaces of the sub-policies. This optimisation is computationally harder (because the space to search is larger) but it frees the researcher from having to pick the right policy class.
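To give a feel for why the hybrid search space is larger, here is a minimal sketch that compares searching three policy classes separately with searching their Cartesian product. The grids and their values are hypothetical placeholders, chosen only to make the counting visible; the actual optimiser works on continuous parameters.

```python
from itertools import product

# Hypothetical, deliberately coarse grids for three sub-policies.
# The values are placeholders, not taken from the model.
tac_grid = [100_000, 500_000, 1_000_000]  # candidate quotas
season_grid = [90, 180, 270, 365]         # candidate season lengths (days)
mpa_grid = [0, 20, 40]                    # candidate MPA side lengths (cells)

# Searching each policy class on its own: the candidates add up.
separate_candidates = len(tac_grid) + len(season_grid) + len(mpa_grid)

# Searching the hybrid: the candidates multiply (Cartesian product).
hybrid_candidates = list(product(tac_grid, season_grid, mpa_grid))

print(separate_candidates)     # 10
print(len(hybrid_candidates))  # 36
```

Each extra sub-policy multiplies the number of combinations, which is why the hybrid problem is harder even though it needs no prior choice of policy class.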

Mixed 2 Species - Hybrid Policy

Take the optimal 2-species TAC problem where blue and red fish live in the same areas and fishers cannot avoid catching both at once. When we try to maximise the score: \[ \text{Score} = \sum_{t=1}^{20} \text{Red Landings}_t + \text{Blue Biomass}_{20} \] the optimiser looking for the optimal TAC generates the following Bayesian posterior:

The posterior is L-shaped both because agents have no way to avoid blue fish when targeting red (since the two species live together) and because global TACs give no incentive to target one species anyway.
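The score defined above can be written as a small function. This is only a sketch: the function name and the one-value-per-year input layout are assumptions, and the numbers in the example call are made up.

```python
from typing import Sequence


def score(red_landings: Sequence[float], blue_biomass: Sequence[float]) -> float:
    """Sum of red landings over 20 years plus blue biomass in year 20.

    Both sequences are assumed to hold one value per simulated year,
    with index 0 corresponding to year 1.
    """
    if len(red_landings) != 20 or len(blue_biomass) != 20:
        raise ValueError("expected exactly 20 yearly values")
    return sum(red_landings) + blue_biomass[-1]


# Made-up numbers, only to show the call.
example = score([10.0] * 20, [500.0] * 20)
print(example)  # 700.0
```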

Combining instead three different policies:

  • TAC quota
  • MPA
  • Fishing Season

can we obtain a better score?
This is an 8-parameter optimisation problem:

Parameter           Meaning
Quota Red           Fishery-wide maximum of red fish landed before the fishery is closed
Quota Blue          Fishery-wide maximum of blue fish landed before the fishery is closed
Season Length       Maximum number of days before the fishery is closed
\(\text{MPA}_x\)    X coordinate of the top-left MPA corner on the map
\(\text{MPA}_y\)    Y coordinate of the top-left MPA corner on the map
MPA height          Height (in cells) of the MPA
MPA width           Width (in cells) of the MPA
MPA duration        Number of days within a year when the MPA is active
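One way to hand these eight parameters to a generic optimiser is to keep them in a named structure and convert to and from a flat vector. The sketch below is an assumption about how this could be organised, not the model's actual code; the class name, field names and example values are hypothetical.

```python
from dataclasses import astuple, dataclass, fields
from typing import List, Sequence


@dataclass
class HybridPolicy:
    """Hypothetical container for the eight hybrid-policy parameters."""
    quota_red: float      # fishery-wide red landings cap
    quota_blue: float     # fishery-wide blue landings cap
    season_length: float  # days before the fishery closes
    mpa_x: float          # top-left corner, x coordinate
    mpa_y: float          # top-left corner, y coordinate
    mpa_height: float     # height in cells
    mpa_width: float      # width in cells
    mpa_duration: float   # days per year the MPA is active

    def to_vector(self) -> List[float]:
        """Flatten to the vector an optimiser would search over."""
        return list(astuple(self))

    @classmethod
    def from_vector(cls, vector: Sequence[float]) -> "HybridPolicy":
        """Rebuild the named structure from an optimiser's vector."""
        names = [f.name for f in fields(cls)]
        if len(vector) != len(names):
            raise ValueError(f"expected {len(names)} values, got {len(vector)}")
        return cls(**dict(zip(names, vector)))


# Placeholder values, only to show the round trip.
policy = HybridPolicy(1000.0, 2000.0, 200.0, 0.0, 0.0, 10.0, 10.0, 100.0)
vector = policy.to_vector()
print(len(vector))                                # 8
print(HybridPolicy.from_vector(vector) == policy)  # True
```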

While having more parameters should help in improving the score, we might intuitively assume policy hybrids are not helpful in this scenario for two reasons. First, quotas are already a way to define season length that is more flexible and accurate than day limits. Second, both species are uniformly distributed throughout the map, making MPAs look pointless.

Surprisingly then, the best hybrid policy scores on average 50% above the optimal TAC alone. Even more surprisingly the hybrid policy optimisation collapses into selecting an MPA-only policy. The optimal hybrid parameters found by the optimiser are:

Parameter           Value
Quota Red           1,221,258
Quota Blue          1,997,840
Season Length       366
\(\text{MPA}_x\)    16
\(\text{MPA}_y\)    5
MPA height          40
MPA width           40
MPA duration        283

The optimiser effectively turned off the TAC by setting quota levels too high to ever bind. It turned off the fixed-season policy as well by setting the season length longer than a year. The MPA is very large, but for about 80 days a year (365 - 283 = 82) it is inactive and boats can fish in it. The next figure shows the position of the MPA:
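The reading of the season and MPA parameters can be checked with a little arithmetic. The sketch below assumes a 365-day year and uses hypothetical helper names; it does not check the quotas, since whether a quota binds depends on simulated landings that are not reproduced here. With the values in the table the MPA is open for 82 days, i.e. roughly 80.

```python
DAYS_PER_YEAR = 365  # assumption: the simulated year has 365 days


def season_binds(season_length: int, days_per_year: int = DAYS_PER_YEAR) -> bool:
    """A fixed season only constrains fishing if it is shorter than the year."""
    return season_length < days_per_year


def mpa_open_days(mpa_duration: int, days_per_year: int = DAYS_PER_YEAR) -> int:
    """Days per year in which the MPA is inactive and boats may fish inside it."""
    return max(0, days_per_year - mpa_duration)


# Values taken from the optimal hybrid parameters in the table.
print(season_binds(366))   # False: the season limit never closes the fishery
print(mpa_open_days(283))  # 82
```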

The position of the temporary MPA when using the optimal hybrid policy

The reason a spatial policy works so well is that it exploits, without being explicitly coded to do so, the logistic growth of the biology cells. Without regulations boats generate fishing fronts, depleting the cells closest to port. Were the effort more dispersed, fewer cells would be emptied and recruitment would be higher for the same amount of global biomass.
A temporary MPA is a way to disperse effort somewhat. When the MPA is open, agents exploit it (since it is close to port and more profitable), but since the MPA is only ever open for a few months, the effect is never enough to cause major depletion. For the rest of the year agents fish the line around the MPA, which causes some local depletion, but this is tempered by the days agents spend fishing within the protected area.
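The dispersion argument above can be illustrated with a toy calculation. The sketch assumes the textbook logistic form r·B·(1 − B/K) for each cell, with a hypothetical growth rate and carrying capacity; the model's actual biology may differ in its details. Because logistic growth is concave in biomass, spreading the same total biomass evenly across cells yields more recruitment than concentrating it in a few nearly full cells next to nearly empty ones.

```python
from typing import Sequence


def logistic_growth(biomass: float, r: float = 0.5, k: float = 100.0) -> float:
    """Textbook logistic growth of one cell; r and k are hypothetical values."""
    return r * biomass * (1.0 - biomass / k)


def total_recruitment(cells: Sequence[float]) -> float:
    """Sum of per-cell logistic growth over the whole map."""
    return sum(logistic_growth(b) for b in cells)


# Same global biomass (200 units) distributed in two ways.
concentrated = [5.0, 5.0, 95.0, 95.0]  # fishing front: cells near port almost empty
dispersed = [50.0, 50.0, 50.0, 50.0]   # effort spread out: every cell half full

print(sum(concentrated) == sum(dispersed))                           # True
print(total_recruitment(dispersed) > total_recruitment(concentrated))  # True
```

With these placeholder numbers the dispersed map recruits about five times as much as the concentrated one, even though both hold the same total biomass.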

The next figure shows red fish dynamics, comparing a sample run using the optimal TAC alone against one using the optimal hybrid policy. There are always more red landings with the hybrid policy than with the optimal TAC. For the first 8 years this results in biomass being higher in the TAC-only scenario. However, because effort is dispersed in the hybrid scenario, recruitment doubles (as most cells have some biomass spawning there). The result is that, while more red fish are landed, red biomass is 1M units higher by the end of the simulation when using the hybrid policy.

Select dynamics comparing the optimal TAC and the optimal hybrid policy sample runs

The “hybrid” policy has a large score advantage over the other policies, as shown in the next figure.

The average score obtained by running the same mixed scenario 100 times with each policy.

Separated 2 Species - Hybrid Policy

We want to maximize the same score: \[ \text{Score} = \sum_{t=1}^{20} \text{Red Landings}_t + \text{Blue Biomass}_{20} \] where blue fish live in the south and red fish live in the north.
We saw this example when studying optimal ITQs, and we know the posterior looks like this:

The highest score is achieved by setting blue quotas to 0 and aggregate red quotas to 364730.3 units.

Can we do better by using a hybrid policy? The answer is yes, and interestingly we can do so without quotas, just by mixing an MPA and a fixed-term season. The optimal parameters found by the Bayesian optimiser are:

Parameter           Value
Quota Red           1,121,503
Quota Blue          1,228,868
Season Length       118
\(\text{MPA}_x\)    0
\(\text{MPA}_y\)    24
MPA height          40
MPA width           40
MPA duration        365

These correspond to a year-round MPA covering the entire area where blue fish live and a short season in which to fish the remaining (red) areas. The quotas are set at levels so high that they never bind.
This hybrid solution pairs the obvious benefit of protecting blue fish with an MPA with the finding that effort control in days returns a better long-term yield than fixed yearly quotas.

The average score obtained by running the same scenario 100 times with each policy.

MPA-only rules do not perform as well in the geographically separated scenario as they did in mixed species simulations. The optimal policy suggested is similar, however, with a large MPA that is open for the last 100 days of the season. Boats fish the line for the remaining 265 days of the year.

Parameter           Value
\(\text{MPA}_x\)    1
\(\text{MPA}_y\)    10
MPA height          40
MPA width           40
MPA duration        265

Penalized Hybrid Policy

While both examples above showed the optimiser turning off some policies to achieve the best score, this is not usually the case. The policy suggested by the optimiser could be an unworkable mix of policies that would be too hard to implement in the real world. We can tweak this procedure to penalize hybrid policies that are too complex. There are fundamentally four ways to do so:

  1. Add a theoretical penalty to the score function to lower the size of the parameters. This procedure would work much like regularization in statistics.
  2. If a cost function exists for the implementation of each policy, the cost can be added to the score function.
  3. Turn the problem into a multi-objective optimisation where one of the dimensions is the complexity of the hybrid policy.
  4. Turn the problem into a multi-objective optimisation where one of the dimensions is the actual cost of implementing the hybrid policy.
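Two of the options above can be sketched in a few lines. The first function illustrates option 1 with an L1-style penalty on rescaled parameters, which is one possible choice of regulariser and not necessarily the one intended here. The second illustrates the multi-objective options 3 and 4 by keeping only the candidates that are not dominated on (score, complexity or cost). All names, scales and numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def l1_penalised_score(raw_score: float, params: Sequence[float],
                       scales: Sequence[float], weight: float) -> float:
    """Raw score minus a weighted L1 penalty on parameters divided by their scale."""
    penalty = sum(abs(p / s) for p, s in zip(params, scales))
    return raw_score - weight * penalty


def pareto_front(candidates: List[Tuple[str, float, float]]) -> List[str]:
    """Names of non-dominated candidates given (name, score, complexity).

    Higher score is better; lower complexity (or cost) is better.
    """
    front = []
    for name, score, complexity in candidates:
        dominated = any(
            other_score >= score and other_complexity <= complexity
            and (other_score > score or other_complexity < complexity)
            for _, other_score, other_complexity in candidates
        )
        if not dominated:
            front.append(name)
    return front


# Made-up candidates: (label, average score, number of active sub-policies).
candidates = [("A", 100.0, 1), ("B", 140.0, 1), ("C", 150.0, 3), ("D", 120.0, 2)]

print(l1_penalised_score(100.0, [2.0, 4.0], [4.0, 8.0], 10.0))  # 90.0
print(pareto_front(candidates))                                 # ['B', 'C']
```

The penalised version returns a single best policy for a given weight, while the Pareto front leaves the trade-off between score and complexity to the decision maker.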